Density Estimation Over Data Stream
Abstract
Density estimation is an important but costly operation for applications that need to know the distribution of a data set. Moreover, when the data arrives as a stream, traditional density estimation methods cannot cope with it efficiently. In this paper, we examine the problem of computing a density function over data streams and develop a novel method to solve it. Our algorithm uses a new concept, the M-Kernel, and has the following characteristics: (1) its running time is linear in the data size, (2) it keeps the whole computation within a limited amount of memory, (3) its accuracy is comparable to that of traditional methods, (4) a usable density model is available at any time during processing, and (5) it is flexible and can suit different stream models. Analytical and experimental results show the efficiency of the proposed algorithm.
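The abstract does not spell out how M-Kernels work, so the following is only a minimal sketch of the general idea it describes: a one-dimensional Gaussian kernel estimate that stays within a fixed memory budget by merging the two closest kernels whenever the budget is exceeded. The class name `StreamingKDE`, the parameters `max_kernels` and `bandwidth`, and the moment-preserving merge rule are all assumptions made for illustration, not the paper's method.

```python
import bisect
import math


class StreamingKDE:
    """Bounded-memory 1-D Gaussian kernel density sketch (illustrative only).

    Hypothetical design: not the M-Kernel algorithm from the paper.
    """

    def __init__(self, max_kernels=50, bandwidth=0.5):
        self.max_kernels = max_kernels      # memory budget (assumed parameter)
        self.var0 = bandwidth ** 2          # variance of each fresh kernel
        self.kernels = []                   # sorted list of [mean, variance, weight]
        self.total = 0.0                    # number of points seen so far

    def update(self, x):
        """Absorb one stream element; O(max_kernels) work per element."""
        bisect.insort(self.kernels, [float(x), self.var0, 1.0])
        self.total += 1.0
        if len(self.kernels) > self.max_kernels:
            self._merge_closest()

    def _merge_closest(self):
        """Merge the adjacent pair with the smallest gap between means,
        preserving the pair's combined weight, mean, and variance."""
        i = min(range(len(self.kernels) - 1),
                key=lambda j: self.kernels[j + 1][0] - self.kernels[j][0])
        (m1, v1, w1), (m2, v2, w2) = self.kernels[i], self.kernels[i + 1]
        w = w1 + w2
        m = (w1 * m1 + w2 * m2) / w
        v = (w1 * v1 + w2 * v2) / w + w1 * w2 * (m1 - m2) ** 2 / (w * w)
        self.kernels[i:i + 2] = [[m, v, w]]  # merged mean lies between m1 and m2

    def density(self, x):
        """Evaluate the current estimate; usable at any point in the stream."""
        if not self.kernels:
            return 0.0
        s = 0.0
        for m, v, w in self.kernels:
            s += w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
        return s / self.total
```

With a fixed `max_kernels`, each update costs a bounded amount of work, so the total running time grows linearly with the stream length and memory stays constant, which mirrors properties (1), (2), and (4) listed in the abstract. Accuracy depends on the merge rule and bandwidth, which are arbitrary choices here.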
Similar resources
Density Estimation over Data Streams
A growing number of real-world applications share the property that they have to deal with transient data arriving in massive volumes, so-called data streams. The characteristics of these data streams render their analysis by means of conventional techniques extremely difficult, in the majority of cases even impossible. In fact, to be applicable to data streams, a technique has to meet rigid proc...
Stream Mining via Density Estimators: A Concrete Application
Many real-world applications share the property that the data they process arrives in streams. The transient and volatile nature of these streams renders the application of common processing and analysis techniques difficult. In particular, the mining of streams has proved to be a difficult task due to the rigid processing requirements that must be met within the data stream scenario. We propos...
Estimation of 3D density distribution of chromites deposit using gravity data
We invert the surface gravity data to recover the subsurface 3D density distribution using two strategies. In the first strategy, we assume wide density model bounds for inverting the gravity data, and in the second strategy, the inversion procedure is carried out with limited density bounds. We discretize the earth model into rectangular cells of constant and unidentified density. The number of cells i...
Continuous Adaptive Outlier Detection on Distributed Data Streams
In many applications, stream data are too voluminous to be collected centrally and are often transmitted over a distributed network. In this paper, we focus on outlier detection over distributed data streams in real time. First, we formalize the problem of outlier detection using the kernel density estimation technique. Then, we adopt the fading strategy to keep pace with the transie...
Towards Kernel Density Estimation over Streaming Data
A variety of real-world applications rely heavily on the analysis of transient data streams. Due to the rigid processing requirements of data streams, common analysis techniques known from data mining are not applicable. A fundamental building block of many data mining and analysis approaches is density estimation. It provides a well-defined estimation of a continuous data distribution, a ...
Publication date: 2002